Moral Rule


Align on the Fly: Adapting Chatbot Behavior to Established Norms

Xu, Chunpu, Chern, Steffi, Chern, Ethan, Zhang, Ge, Wang, Zekun, Liu, Ruibo, Li, Jing, Fu, Jie, Liu, Pengfei

arXiv.org Artificial Intelligence

In this paper, we aim to align large language models with ever-changing, complex, and diverse human values (e.g., social norms) across time and locations. This presents a challenge to existing alignment techniques, such as supervised fine-tuning, which internalize values within model parameters. To overcome this, we propose an On-the-fly Preference Optimization (OPO) method, which performs real-time alignment in a streaming fashion. It employs an external memory to store established rules for alignment; these rules constrain the behavior of LLMs without further training, allowing human values to be conveniently updated and customized. We also introduce a scalable evaluation to assess the proposed method more effectively. Experimental results on both human-annotated and auto-generated questions from the legal and moral domains demonstrate the effectiveness of the proposed OPO method. Our code and data are released at https://github.com/GAIR-NLP/OPO.
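The external-memory design described in this abstract lends itself to a brief illustration. The Python sketch below shows the general retrieval idea under stated assumptions: rules live in an updatable store, the most relevant ones are fetched per query by similarity, and they are prepended to the prompt so that values can change without any retraining. All names here (embed, RuleMemory, aligned_prompt) and the toy hash-seeded embedding are hypothetical stand-ins, not the authors' API; the actual OPO implementation is in the linked repository.

import numpy as np

def embed(text: str) -> np.ndarray:
    # Placeholder embedding (hash-seeded random vector); swap in a real sentence encoder.
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    vec = rng.standard_normal(64)
    return vec / np.linalg.norm(vec)

class RuleMemory:
    # External store of rule texts, searched by cosine similarity.
    def __init__(self):
        self.rules = []
        self.vecs = []

    def add(self, rule):
        # Rules can be added, edited, or removed at any time -- no model training involved.
        self.rules.append(rule)
        self.vecs.append(embed(rule))

    def retrieve(self, query, k=2):
        # Return the k rules most similar to the query (unit vectors, so dot = cosine).
        q = embed(query)
        scores = [float(q @ v) for v in self.vecs]
        top = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)[:k]
        return [self.rules[i] for i in top]

def aligned_prompt(memory, question):
    # Constrain the model by citing the retrieved rules in the prompt itself.
    rules = "\n".join("- " + r for r in memory.retrieve(question))
    return ("Follow these established rules:\n" + rules +
            "\n\nQuestion: " + question + "\nAnswer:")

mem = RuleMemory()
mem.add("Gifts to public officials above a set value must be declared.")
mem.add("Pedestrians must use marked crossings where provided.")
print(aligned_prompt(mem, "May I cross the street mid-block?"))

In a scheme like this, updating a community's norms is just an edit to the memory store, which is the property the abstract highlights over fine-tuning approaches that bake values into model parameters.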


Morality, Machines and the Interpretation Problem: A value-based, Wittgensteinian approach to building Moral Agents

Badea, Cosmin, Artus, Gregory

arXiv.org Artificial Intelligence

We argue that the attempt to build morality into machines is subject to what we call the Interpretation Problem: any rule we give the machine is open to infinite interpretation in ways that we might morally disapprove of. The Interpretation Problem in Artificial Intelligence is an illustration of Wittgenstein's general claim that no rule can contain the criteria for its own application. Using games as an example, we attempt to define the structure of normative spaces and argue that any rule-following within a normative space is guided by values that are external to that space and that cannot themselves be represented as rules. In light of this problem, we analyse the types of mistakes an artificial moral agent could make, and we make suggestions about how to build morality into machines: by getting them to interpret the rules we give in accordance with these external values, through explicit moral reasoning and the presence of structured values, through adjustment of the causal power assigned to the agent, and through interaction with human agents, such that the machine develops a virtuous character and the impact of the Interpretation Problem is minimised.


Vladimir Putin calls for set of 'moral rules' to guide interaction between humans and AI

Daily Mail - Science & tech

Vladimir Putin has called for 'moral rules' on the development of artificial intelligence, urging companies that 'technology must not be invented for the sake of technology'. Speaking at an event on AI technology in Moscow, Russia, on Saturday, the Russian president called for safeguards, setting out rules for how humans should interact with robots. President Putin said: 'Discussion is currently underway on social aspects and implications of the use of artificial intelligence. It is a very important issue. I suggest that the professional community and companies should contemplate drawing up a set of moral rules for interaction between humans and artificial intelligence.'